Audio-visual synthesis of talking faces from speech production correlates

نویسندگان

  • Takaaki Kuratate
  • Kevin G. Munhall
  • Philip Rubin
  • Eric Vatikiotis-Bateson
  • Hani Yehia
چکیده

This paper presents technical refinements and extensions of our system for correlating audible and visible components of speech behavior and subsequently using those correlates to generate realistic talking faces. Introduction of nonlinear estimation techniques has improved our ability to generate facial motion either from the speech acoustics or from orofacial muscle EMG. Also, preliminary evidence is given for the strong correlation found 3D head motion and fundamental frequency (F0). Coupled with improved methods for deriving facial deformation parameters from static 3D face scans, more realistic talking faces are now being synthesized.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Audiovisual Speech Synchrony Measure: Application to Biometrics

Speech is a means of communication which is intrinsically bimodal: the audio signal originates from the dynamics of the articulators. This paper reviews recent works in the field of audiovisual speech, and more specifically techniques developed to measure the level of correspondence between audio and visual speech. It overviews the most common audio and visual speech front-end processing, trans...

متن کامل

Audio-visual speech synthesis for finnish

We describe our Finnish audio-visual speech synthesizer, its evaluation and discuss possible improvements. We have combined a three dimensional facial model with a commercial audio text-to-speech synthesizer. The visual speech is based on a letter-to-viseme mapping and the animation is created by linear interpolation between the visemes. An intelligibility test was run to quantify the benefit o...

متن کامل

Subjective Evaluation for HMM-Based Speech-To-Lip Movement Synthesis

An audio-visual intelligibility score is generally used as an evaluation measure in visual speech synthesis. Especially an intelligibility score of talking heads represents accuracy of facial models[1][2]. The facial models has two stages such as construction of real faces and realization of dynamical human-like motions. We focus on lip movement synthesis from input acoustic speech to realize d...

متن کامل

Lifelike Talking Faces for Interactive Services

Lifelike talking faces for interactive services are an exciting new modality for man–machine interactions. Recent developments in speech synthesis and computer animation enable the real-time synthesis of faces that look and behave like real people, opening opportunities to make interactions with computers more like face-to-face conversations. This paper focuses on the technologies for creating ...

متن کامل

Audio-visual speech asynchrony modeling in a talking head

An audio-visual speech synthesis system with modeling of asynchrony between auditory and visual speech modalities is proposed in the paper. Corpus-based study of real recordings gave us the required data for understanding the problem of modalities asynchrony that is partially caused by the coarticulation phenomena. A set of context-dependent timing rules and recommendations was elaborated in or...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1999